feat(governance): add codeql-reusable.yml — consolidate 263-repo codeql.yml drift#192

Merged: hyperpolymath merged 2 commits into main from feat/codeql-reusable on May 26, 2026

@hyperpolymath
Owner

Summary

Third foundational reusable in the workflow-convergence sweep (#168, #174, #187, #190 → this). Targets codeql.yml, the 263-deployment CodeQL security-analysis workflow.

Drift survey

Full pagination of gh api /search/code against org:hyperpolymath, blob-SHA grouped:

| Metric | Value |
|---|---|
| Total deployments | 263 |
| Unique blob SHAs | 69 (26% drift, same as mirror.yml) |
| Top 7 SHAs coverage | 195/263 (74%) |
| Long-tail SHAs | 62 SHAs / 68 repos |

Language matrix distribution (key for design)

| Languages | Repos | Share |
|---|---|---|
| javascript-typescript only | 223 | 84.8% |
| actions only | 22 | 8.4% |
| NONE (no matrix declared) | 6 | 2.3% |
| rust only | 3 | 1.1% |
| javascript-typescript,rust | 3 | 1.1% |
| actions,javascript-typescript | 3 | 1.1% |
| actions,javascript-typescript,rust | 2 | 0.8% |
| actions,rust | 1 | 0.4% |

100% of estate variants use build-mode: none — verified across rust-only, actions-only, and mixed sampled variants.

Design choice — single-language single-job reusable

The caller invokes the reusable once per language. Multi-language wrappers (~3.4%) call it once per language as parallel jobs; per-language SARIF separation is preserved by the category: "/language:${{ inputs.language }}" field on the analyze step.

This matches how callers already think about CodeQL (one job per language) without forcing a JSON-array input or a matrix-as-string input. The alternative (matrix-as-input) would have made the ~85% of repos that need only the javascript-typescript default more awkward to wrap.

Inputs

  • language (string, default javascript-typescript) — single CodeQL language identifier
  • build-mode (string, default none) — 100% of estate currently uses none; default covers everything
  • runs-on (string, default ubuntu-latest)
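
The design and inputs above suggest roughly the following shape for the reusable. This is a sketch under stated assumptions, not the merged file: the workflow name, job name, permissions block, step names and the `<sha>` pins are placeholders. The `languages`, `build-mode` and `category` keys are existing inputs of the github/codeql-action init and analyze actions.

```yaml
# Hypothetical sketch of codeql-reusable.yml; not the merged file.
name: CodeQL (reusable)

on:
  workflow_call:
    inputs:
      language:
        description: Single CodeQL language identifier
        type: string
        default: javascript-typescript
      build-mode:
        description: CodeQL build mode (the estate currently uses none)
        type: string
        default: none
      runs-on:
        description: Runner label for the analysis job
        type: string
        default: ubuntu-latest

jobs:
  analyze:
    runs-on: ${{ inputs.runs-on }}
    # Assumed permissions; the effective set is capped by the calling job.
    permissions:
      contents: read
      security-events: write
    steps:
      - name: Check out the repository
        uses: actions/checkout@<sha>

      - name: Initialise CodeQL for the requested language
        uses: github/codeql-action/init@<sha>
        with:
          languages: ${{ inputs.language }}
          build-mode: ${{ inputs.build-mode }}

      - name: Analyse and upload SARIF
        uses: github/codeql-action/analyze@<sha>
        with:
          # One category per language keeps SARIF results separate when
          # a wrapper calls this workflow several times in parallel.
          category: "/language:${{ inputs.language }}"
```

Because the category is derived from the input, a wrapper that calls the reusable for javascript-typescript, actions and rust would upload three separately labelled SARIF results.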

Caller wrapper examples

Default language only, javascript-typescript (~85% of estate):

```yaml
jobs:
  codeql:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
```

~5 lines, replacing ~49.

Rust-only or actions-only (~10% of estate):

```yaml
jobs:
  codeql:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
    with:
      language: rust
```

Multi-language (~3.4% of estate):

```yaml
jobs:
  codeql-js:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
  codeql-actions:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
    with:
      language: actions
  codeql-rust:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
    with:
      language: rust
```
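
The fragments above show only the jobs block. A complete wrapper file also needs a name, triggers and a permissions grant, since a called workflow cannot exceed the permissions of its caller. The sketch below is illustrative only: the triggers, cron schedule and permissions are assumptions, not values taken from this PR.

```yaml
# Hypothetical complete wrapper (.github/workflows/codeql.yml in a calling repo).
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 6 * * 1'   # assumed weekly schedule
# Assumed grant: the reusable's job can use at most these permissions.
permissions:
  contents: read
  security-events: write
jobs:
  # Default language (javascript-typescript); no inputs needed.
  codeql-js:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
  # Second language runs as an independent parallel job.
  codeql-rust:
    uses: hyperpolymath/standards/.github/workflows/codeql-reusable.yml@<sha>
    with:
      language: rust
```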

Rollout plan

NOT started in this PR — owner-gated.

| Wave | Repos | Action |
|---|---|---|
| 1: bulk-mechanical | ~210 | Single-language javascript-typescript default. One-line wrapper. |
| 2: single non-default | ~25 | Override language: rust or language: actions. |
| 3: multi-language | ~9 | Two or three reusable invocations per wrapper. |
| 4: NEEDS_REVIEW | ~18 | NONE matrix (6) + 100-line custom workflows (~2-3). Per-repo review. |

Total expected sweep: ~245 PRs (93% mechanical).

Pattern hardening

🤖 Generated with Claude Code

…ql.yml drift

Extends the #168/#174/#187/#190 reusable-workflow pattern to codeql.yml,
the third foundational security workflow in the convergence sweep.

Drift survey (gh api /search/code paginated over org:hyperpolymath,
blob-SHA grouped):
- 263 deployments, 69 unique blob SHAs (26% drift)
- Top 7 SHAs cover 195/263 (74%); long tail of 62 SHAs covers 68 repos

Language matrix distribution (key for the reusable design):
- 223 (84.8%) javascript-typescript only
-  22  (8.4%) actions only
-   6  (2.3%) NONE (no matrix declared — needs per-repo review)
-   3  (1.1%) rust only
-   3  (1.1%) javascript-typescript,rust
-   3  (1.1%) actions,javascript-typescript
-   2  (0.8%) actions,javascript-typescript,rust
-   1  (0.4%) actions,rust

100% of estate variants currently use `build-mode: none`.

Design choice — single-language single-job reusable (vs matrix-as-input):
- Single-language wrappers (~85%) call the reusable once with defaults.
- Multi-language wrappers (~3.4%) call the reusable once per language
  in parallel; per-language SARIF separation preserved via the
  `category: "/language:${{ inputs.language }}"` field.

This pattern matches how callers already think about CodeQL (one job
per language) without forcing them to pass JSON-array inputs.

Inputs:
- language (string, default `javascript-typescript`)
- build-mode (string, default `none`)
- runs-on (string, default `ubuntu-latest`)

Sweep classification (preview):
- TRIVIAL (~210): single javascript-typescript, default wrapper
- Single-language non-default (~25): rust or actions, override language
- Multi-language (~9): wrapper invokes reusable per-language
- NEEDS_REVIEW (~18): NONE matrix or non-canonical custom workflow

After merge, ~93% of 263 wrappers are mechanical conversions.
The PR was opened with auto-merge ON 4h ago but no workflow runs ever
fired against the head commit. The required-checks gate cannot be
satisfied without CI runs, so the PR cannot auto-merge. Pushing this
empty commit to re-trigger workflows.
hyperpolymath added a commit that referenced this pull request May 26, 2026
Same as #192 (codeql-reusable) — auto-merge enabled but zero workflow
runs against the head commit. Pushing empty commit to re-trigger CI.
hyperpolymath added a commit that referenced this pull request May 26, 2026
…ergence set (#205)

## Summary

5th and final reusable in the workflow convergence campaign (see #199
for the meta-doc). Consolidates the per-repo `scorecard.yml` workflow.

## Drift signal (full pagination + per-repo verified)

- **258** top-level estate deployments
- **626** nested copies in monorepos (asdf-tool-plugins,
developer-ecosystem, ssg-collection, standards, ambientops,
julia-ecosystem, etc. — Layer-2 truncation discovery via #204's helper)
- **46** unique blob SHAs / 17.8% structural drift
- Top SHA covers **100/258 (38.8%)** — highest dominant-cluster of the 5
campaigns
- Top 7 SHAs cover ~80%
- **100% mechanical drift, ZERO feature variance** — SPDX header
(PMPL-1.0 / PMPL-1.0-or-later / MPL-2.0), `upload-sarif` SHA-pin churn,
`permissions: read-all` vs `contents: read` wording

## Design

- One input: `runs-on` (default ubuntu-latest)
- No `secrets: inherit` — Scorecard uses `GITHUB_TOKEN` directly
- Caller MUST grant `security-events: write` + `id-token: write` on the
calling job (called-workflow permissions are capped by caller)
- Caller keeps own `on:` triggers + `concurrency:` group
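
A caller wrapper that satisfies the constraints above might look like the following. It is a sketch only: the workflow name, triggers, cron schedule and the extra `contents: read` grant are assumptions, while `runs-on`, `security-events: write` and `id-token: write` come from the design notes.

```yaml
# Hypothetical caller wrapper for scorecard-reusable.yml.
name: OpenSSF Scorecard
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 4 * * 1'   # assumed weekly schedule
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
jobs:
  scorecard:
    # Job-level permissions replace the workflow-level defaults, and
    # the called workflow cannot exceed what is granted here.
    permissions:
      contents: read          # assumed, for checkout
      security-events: write  # SARIF upload
      id-token: write         # required by the design notes above
    uses: hyperpolymath/standards/.github/workflows/scorecard-reusable.yml@<sha>
    with:
      runs-on: ubuntu-latest
    # No `secrets: inherit`: the design relies on GITHUB_TOKEN only.
```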

## Per Layer-3 caveat from the campaign meta-doc

Nested workflows are inert — GitHub Actions only runs
`.github/workflows/` at the repo root. Sweeping the 626 nested copies is
single-source-of-truth cleanup, not security hardening.

## Campaign convergence set (closes with this PR)

| PR | Template |
|---|---|
| #187 | mirror-reusable.yml |
| #190 | secret-scanner-reusable.yml |
| #192 | codeql-reusable.yml |
| #193 | hypatia-scan-reusable.yml |
| #194 | sweep-classifier scripts |
| #199 | campaign meta-doc |
| #204 | list-workflow-paths.sh (bypass /search/code undercount) |
| **this** | **scorecard-reusable.yml** |

## Test plan

- [ ] Wrapper sweep (~258 top-level + ~626 nested) — owner-gated; not
part of this PR
- [ ] Update classify-* scripts to consume helper TSV — follow-up

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@github-actions

🔍 Hypatia Security Scan

Findings: 117 issues detected

| Severity | Count |
|---|---|
| 🔴 Critical | 64 |
| 🟠 High | 43 |
| 🟡 Medium | 10 |

⚠️ Action Required: Critical security issues found!

View findings
[
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/deno-ci-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "deno-ci-reusable.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance-reusable.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Action hyperpolymath/standards/.github/workflows/governance-reusable.yml@main needs attention",
    "type": "unpinned_action",
    "file": "governance.yml",
    "action": "pin_sha",
    "rule_module": "workflow_audit",
    "severity": "high"
  },
  {
    "reason": "Python file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/standards/standards/a2ml-templates/state-scm-to-v2.py",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/standards/standards/a2ml/bindings/deno/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/standards/standards/lol/test/vitest.config.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "TypeScript file detected -- banned language",
    "type": "banned_language_file",
    "file": "/home/runner/work/standards/standards/k9-svc/bindings/deno/mod.ts",
    "action": "flag",
    "rule_module": "cicd_rules",
    "severity": "critical"
  },
  {
    "reason": "Agda postulate assumes without proof -- potential soundness hole (4 occurrences, CWE-704)",
    "type": "agda_postulate",
    "file": "/home/runner/work/standards/standards/lol/proofs/theories/information_theory.agda",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "believe_me undermines formal verification (1 occurrences, CWE-704)",
    "type": "believe_me",
    "file": "/home/runner/work/standards/standards/lol/src/abi/Locale.idr",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "critical"
  },
  {
    "reason": "Wildcard CORS -- restrict to specific origins or use env var (1 occurrences, CWE-942)",
    "type": "js_wildcard_cors",
    "file": "/home/runner/work/standards/standards/consent-aware-http/examples/reference-implementations/deno/aibdp_middleware.js",
    "action": "flag",
    "rule_module": "code_safety",
    "severity": "high"
  }
]

Powered by Hypatia Neurosymbolic CI/CD Intelligence

hyperpolymath added a commit that referenced this pull request May 26, 2026
## Summary

Estate-wide audit doc for the workflow convergence campaign just
completed (5 reusable-extraction PRs, methodology gotcha discovery,
parallel-session reconciliation).

- 4 reusables already filed: #187 (mirror), #190 (secret-scanner), #192
(codeql), #193 (hypatia-scan)
- Classifier tooling shipped: #194
- Scorecard-reusable recommended as #195 (cleanest sweep target — zero
nested copies, 38.8% top-SHA dominance, 100% mechanical drift)

## Key findings documented

1. **Nested-path methodology gotcha** (3-layer):
   - `path:.github/workflows` filter excludes nested workflows (mirror
     +133, secret-scanner +282, codeql +518 verified per-repo,
     hypatia-scan +449)
   - Even broad `filename:` queries are org-scope-truncated; per-repo
     recursive-tree queries are the only reliable source of truth
   - **Nested workflows do NOT execute**: GitHub Actions only runs
     `.github/workflows/` at the repo root. Nested copies are inert
     vendored templates or stale leftover, not active deployments

2. **Corrected estate counts** (top-level + nested true totals):
   - hypatia-scan: 704 wrapper sites (×416 LOC ≈ 280k retired)
   - codeql: ~781 wrapper sites (×41 LOC ≈ 32k retired, 3× the original
     estimate)
   - secret-scanner: 563 wrapper sites
   - mirror: 422 wrapper sites
   - scorecard: 258 (no nesting)

3. **Parallel-session ranking divergence** root-caused: sample-drift
   methodology missed homogeneity findings for secret-scanner / codeql /
   scorecard.

## Test plan

- [ ] Validate cross-refs render (`/.claude/CLAUDE.md`, sibling audit
docs)
- [ ] No breakage — additive doc only

## Standing follow-ups (documented in body)

1. Owner-review the 5 open campaign PRs
2. File #195 (scorecard-reusable) to close the set
3. Extend classifiers in #194 to emit per-(repo, path) tuples for
nested-copy sweeps
4. Wrapper sweep firing remains owner-gated per template

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@hyperpolymath hyperpolymath merged commit ccf01cd into main May 26, 2026
18 checks passed
@hyperpolymath hyperpolymath deleted the feat/codeql-reusable branch May 26, 2026 19:33
hyperpolymath added a commit that referenced this pull request May 26, 2026
…e of the reusable trilogy (#193)

## Summary

Fourth and largest-leverage reusable in the workflow-convergence
campaign (#168, #174, #187, #190, #192 → this). Targets
`hypatia-scan.yml`, the **416-line** estate-wide Hypatia neurosymbolic
security scanner.

### Drift survey

Full pagination of `gh api /search/code` against `org:hyperpolymath`:

| Metric | Value |
|---|---|
| Total deployments | **255** |
| Unique blob SHAs | **30** |
| Structural drift | **11.8% — lowest of all 5 surveyed templates** |
| Top 5 SHAs coverage | **213/255 (83.5%)** |
| Top-SHA share | 100 repos (39.2%) |

### Feature variance: zero

Sampled top 7 + long-tail 10 variants — **every single one carries
exactly one `scan` job**. Line counts range 207-416, but this is pure
propagation lag: older repos run an earlier slimmer version of the same
monolithic job; newer repos run the 413-416-line current canonical.

No customisation, no per-repo extras, no missing jobs in the long tail.

### Leverage — biggest of the convergence campaign

| PR | Per-repo lines | Wrappers | LOC retired |
|---|---|---|---|
| #187 mirror | 145 → 12 | ~267 | ~35,500 |
| #190 secret-scanner | ~80 → 12 | ~275 | ~19,000 |
| #192 codeql | 49 → 5 | ~245 | ~10,800 |
| **#193 hypatia-scan** | **416 → 16** | **~235** | **~94,000** |

This single PR's downstream sweep retires more workflow LOC than #187 +
#190 + #192 combined.

### Design

**Zero inputs except `runs-on`.** The scan job body is byte-identical to
the canonical `hypatia-scan.yml` — no per-repo values to parameterise;
every interpolation is `${{ github.* }}` or `${{ secrets.* }}` which
resolve in the caller context.

**Caller MUST:**
- Use `secrets: inherit` — so `GITHUB_TOKEN` + `HYPATIA_DISPATCH_PAT`
flow through. Without inherit, the Phase-2 gitbot-fleet submission step
silently no-ops (it's `continue-on-error`-guarded), and the
DependabotAlerts rule loses read access (HTTP 403).
- Grant `contents: read` + `security-events: write` + `pull-requests:
write` at the call-site permissions block. **Called-workflow permissions
are CAPPED by caller** — without `security-events: write` at the call
site, SARIF upload to Security → Code scanning silently fails.
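
On the reusable side, the design above implies a `workflow_call` header along these lines. This is a hedged sketch, not the merged file: the scan steps are elided and replaced by a placeholder step, and the permissions block is an assumption that mirrors the call-site grant described above.

```yaml
# Hypothetical header of hypatia-scan-reusable.yml; scan steps elided.
on:
  workflow_call:
    inputs:
      runs-on:
        description: Runner label for the scan job
        type: string
        default: ubuntu-latest

jobs:
  scan:
    runs-on: ${{ inputs.runs-on }}
    # Assumed to mirror the call-site grant; capped by the caller.
    permissions:
      contents: read
      security-events: write
      pull-requests: write
    steps:
      - name: Placeholder for the canonical scan steps
        env:
          # Only populated when the caller passes `secrets: inherit`.
          HYPATIA_DISPATCH_PAT: ${{ secrets.HYPATIA_DISPATCH_PAT }}
        run: echo "canonical hypatia-scan.yml job body goes here"
```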

### Caller wrapper shape (post-merge)

```yaml
# SPDX-License-Identifier: PMPL-1.0-or-later
name: Hypatia Security Scan
on:
  push:
    branches: [ main, master, develop ]
  pull_request:
    branches: [ main, master ]
  schedule:
    - cron: '0 0 * * 0'
  workflow_dispatch:
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
permissions:
  contents: read
  security-events: write
  pull-requests: write
jobs:
  scan:
    uses: hyperpolymath/standards/.github/workflows/hypatia-scan-reusable.yml@<sha>
    secrets: inherit
```

~16 lines per repo, replacing 207-416 lines (depending on which
propagation snapshot the repo currently carries).

### Rollout plan

**NOT started in this PR — owner-gated, same as #187 / #190 / #192.**

| Wave | Repos | Action |
|---|---|---|
| 1: current-canonical | ~211 | Top 6 SHAs at 413-416 lines. Pure mechanical wrapper. |
| 2: standardize-up older | ~24 | 207-253-line older versions; convert to the same wrapper; the scan job body auto-upgrades on the next workflow run. |
| 3: per-repo review | ~2 | `hypatia` repo itself (345 lines, likely a development snapshot), `standards` (the canonical source). Exclude from sweep. |

Total expected sweep: **~235 mechanical wrappers** (92.2%), plus 2
excluded and minor per-repo review.

### Pattern hardening

- Same `workflow_call` shape as #168 / #174 / #187 / #190 / #192 — no
new infrastructure.
- Independent of all open standards PRs — lands in any order.

### Note on parallel-session count discrepancy

A separate session's audit memory
([[project_foundational_workflow_survey_2026_05_26]]) recorded
"hypatia-scan: 702 copies × 416 lines × HIGH homogeneity". My
methodology (`gh api /search/code` paginated with
`filename:hypatia-scan.yml path:.github/workflows org:hyperpolymath`)
returns **255**. The HIGH homogeneity finding agrees in both surveys;
the 702 figure is likely scheduled-run count or includes historical
branches. Either way the leverage doesn't change much — this is the
biggest single LOC removal in the campaign.

🤖 Generated with [Claude Code](https://claude.com/claude-code)